{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import warnings\n",
    "# Ignore numpy dtype warnings. These warnings are caused by an interaction\n",
    "# between numpy and Cython and can be safely ignored.\n",
    "# Reference: https://stackoverflow.com/a/40846742\n",
    "warnings.filterwarnings(\"ignore\", message=\"numpy.dtype size changed\")\n",
    "warnings.filterwarnings(\"ignore\", message=\"numpy.ufunc size changed\")\n",
    "\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "%matplotlib inline\n",
    "import ipywidgets as widgets\n",
    "from ipywidgets import interact, interactive, fixed, interact_manual\n",
    "import nbinteract as nbi\n",
    "\n",
    "sns.set()\n",
    "sns.set_context('talk')\n",
    "np.set_printoptions(threshold=20, precision=2, suppress=True)\n",
    "pd.options.display.max_rows = 7\n",
    "pd.options.display.max_columns = 8\n",
    "pd.set_option('display.precision', 2)\n",
    "# This option stops scientific notation for pandas\n",
    "# pd.set_option('display.float_format', '{:.2f}'.format)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Faithfulness\n",
    "\n",
    "We describe a dataset as \"faithful\" if we believe it accurately captures reality. Typically, untrustworthy datasets contain:\n",
    "\n",
    "**Unrealistic or incorrect values**\n",
    "\n",
    "For example, dates in the future, locations that don't exist, negative counts, or large outliers.\n",
    "\n",
    "**Violations of obvious dependencies**\n",
    "\n",
    "For example, an individual's recorded age does not match their birthday.\n",
    "\n",
    "**Hand-entered data**\n",
    "\n",
    "As we have seen, these are typically filled with spelling errors and inconsistencies.\n",
    "\n",
    "**Clear signs of data falsification**\n",
    "\n",
    "For example, repeated names, fake-looking email addresses, or repeated use of uncommon names or fields.\n",
    "\n",
    "Notice the many similarities to data cleaning. As we have mentioned, we often go back and forth between data cleaning and EDA, especially when determining data faithfulness. For example, visualizations often help us identify strange entries in the data."
   ]
  },
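  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Several of the checks above can be written as boolean masks over a DataFrame. The cell below is a minimal sketch on a small table invented for illustration: the `people` table, its column names, and the reference date `as_of` are assumptions and do not come from the dataset used in this section. It flags negative counts, dates after the reference date, and recorded ages that disagree with the age computed from the birthday. Flagged rows are candidates for a closer look rather than automatic deletions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical records, invented for illustration only.\n",
    "people = pd.DataFrame({\n",
    "    'name': ['Ana', 'Ben', 'Cy', 'Dee'],\n",
    "    'birthday': pd.to_datetime(['1990-05-01', '1985-11-20', '2001-02-14', '1979-07-30']),\n",
    "    'age': [30, 35, 19, 52],\n",
    "    'visits': [3, -1, 7, 2],\n",
    "    'signup': pd.to_datetime(['2019-03-01', '2020-01-15', '2031-06-01', '2018-09-09']),\n",
    "})\n",
    "\n",
    "# Fixed reference date (an assumption) so the checks give the same answer on every run.\n",
    "as_of = pd.Timestamp('2020-12-31')\n",
    "\n",
    "# Unrealistic values: negative counts and dates after the reference date.\n",
    "negative_visits = people['visits'] < 0\n",
    "future_signup = people['signup'] > as_of\n",
    "\n",
    "# Dependency check: recompute age from birthday and compare with the recorded age.\n",
    "had_birthday = (\n",
    "    (people['birthday'].dt.month < as_of.month)\n",
    "    | ((people['birthday'].dt.month == as_of.month)\n",
    "       & (people['birthday'].dt.day <= as_of.day))\n",
    ")\n",
    "computed_age = as_of.year - people['birthday'].dt.year - (~had_birthday).astype(int)\n",
    "age_mismatch = computed_age != people['age']\n",
    "\n",
    "# Rows that fail at least one check.\n",
    "suspect = people[negative_visits | future_signup | age_mismatch]\n",
    "suspect"
   ]
  },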
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CASENO</th>\n",
       "      <th>OFFENSE</th>\n",
       "      <th>EVENTDT</th>\n",
       "      <th>EVENTTM</th>\n",
       "      <th>...</th>\n",
       "      <th>BLKADDR</th>\n",
       "      <th>Latitude</th>\n",
       "      <th>Longitude</th>\n",
       "      <th>Day</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>17091420</td>\n",
       "      <td>BURGLARY AUTO</td>\n",
       "      <td>07/23/2017 12:00:00 AM</td>\n",
       "      <td>06:00</td>\n",
       "      <td>...</td>\n",
       "      <td>2500 LE CONTE AVE</td>\n",
       "      <td>37.876965</td>\n",
       "      <td>-122.260544</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>17038302</td>\n",
       "      <td>BURGLARY AUTO</td>\n",
       "      <td>07/02/2017 12:00:00 AM</td>\n",
       "      <td>22:00</td>\n",
       "      <td>...</td>\n",
       "      <td>BOWDITCH STREET &amp; CHANNING WAY</td>\n",
       "      <td>37.867209</td>\n",
       "      <td>-122.256554</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>17049346</td>\n",
       "      <td>THEFT MISD. (UNDER $950)</td>\n",
       "      <td>08/20/2017 12:00:00 AM</td>\n",
       "      <td>23:20</td>\n",
       "      <td>...</td>\n",
       "      <td>2900 CHANNING WAY</td>\n",
       "      <td>37.867948</td>\n",
       "      <td>-122.250664</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>17091319</td>\n",
       "      <td>THEFT MISD. (UNDER $950)</td>\n",
       "      <td>07/09/2017 12:00:00 AM</td>\n",
       "      <td>04:15</td>\n",
       "      <td>...</td>\n",
       "      <td>2100 RUSSELL ST</td>\n",
       "      <td>37.856719</td>\n",
       "      <td>-122.266672</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>17044238</td>\n",
       "      <td>DISTURBANCE</td>\n",
       "      <td>07/30/2017 12:00:00 AM</td>\n",
       "      <td>01:16</td>\n",
       "      <td>...</td>\n",
       "      <td>TELEGRAPH AVENUE &amp; DURANT AVE</td>\n",
       "      <td>37.867816</td>\n",
       "      <td>-122.258994</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 9 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     CASENO                   OFFENSE                 EVENTDT EVENTTM   ...    \\\n",
       "0  17091420             BURGLARY AUTO  07/23/2017 12:00:00 AM   06:00   ...     \n",
       "1  17038302             BURGLARY AUTO  07/02/2017 12:00:00 AM   22:00   ...     \n",
       "2  17049346  THEFT MISD. (UNDER $950)  08/20/2017 12:00:00 AM   23:20   ...     \n",
       "3  17091319  THEFT MISD. (UNDER $950)  07/09/2017 12:00:00 AM   04:15   ...     \n",
       "4  17044238               DISTURBANCE  07/30/2017 12:00:00 AM   01:16   ...     \n",
       "\n",
       "                          BLKADDR   Latitude   Longitude     Day  \n",
       "0               2500 LE CONTE AVE  37.876965 -122.260544  Sunday  \n",
       "1  BOWDITCH STREET & CHANNING WAY  37.867209 -122.256554  Sunday  \n",
       "2               2900 CHANNING WAY  37.867948 -122.250664  Sunday  \n",
       "3                 2100 RUSSELL ST  37.856719 -122.266672  Sunday  \n",
       "4   TELEGRAPH AVENUE & DURANT AVE  37.867816 -122.258994  Sunday  \n",
       "\n",
       "[5 rows x 9 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "calls = pd.read_csv('data/calls.csv')\n",
    "calls.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1a1ebb2898>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZUAAAEOCAYAAABB+oq7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzt3XtUVPX+PvBnYIBhgOMIaCc8Xghc\nkIqoKF4ThOOFI2BHTVfe9WigKJqGmeal8JIX0MQiMaofipmYKy9RKXjrLEpTM8yUvIxmoiUQ4wWG\nkZn9+4OvM80hdQb2MDPwvNY6izP7s7fz3u82Pn5m9kUiCIIAIiIiEThYuwAiImo8GCpERCQahgoR\nEYmGoUJERKJhqBARkWgYKkREJBqrhEphYSH69eunf33r1i3MmDEDPXv2RN++fZGcnAyNRgMAEAQB\nKSkp6NWrF3r06IHly5dDq9Xqt/3oo4/w3HPPoVu3bnjllVdQUVHR4PtDREQ1GjRUBEHArl27MGXK\nFDx48EC/PCkpCX//+99x7NgxfPbZZzh79izeeecdAEB2djaOHDmCvXv3Ijc3F6dPn8b27dsBAIcP\nH0ZmZiaysrJw9OhRqFQqbNy4sSF3iYiI/kTakG/23nvv4YsvvkB8fDy2bNkCANBoNHB1dcX06dPh\n4uKCFi1aICYmBgcPHgQA7NmzBxMnTkTLli0BAHFxcdi0aRPGjx+PPXv2YOTIkfD19QUAzJ49G5Mm\nTUJSUhIcHR2fWM/t23fN3geJRAIvLzeUlt5HU79ulL0wYC9qsA8GjbkXLVp4PHKsQUNlxIgRiI+P\nx4kTJ/TLnJ2dkZGRYbTe4cOHERgYCAC4cuUK/P399WO+vr64dOkSBEHAlStXMHDgQKOxu3fv4rff\nfoOPj88T65FIJHAwc67m4CCBRCKBVCqBTmfeto0Ne2HAXtRgHwyaai8aNFQezjYeRRAErFixAleu\nXMHatWsBAJWVlZDJZPp1XF1dodPpoNFo/nLs4Tam8PJyg0QiMXc3AAAKhVudtmuM2AsD9qIG+2DQ\n1HrRoKHyOGq1GvPnz0dRURG2bt0KLy8vAIBMJkNVVZV+vcrKSkilUri4uPzlGAC4uZn2H7G09H6d\nZioKhRvKy+9Dp2tcU1pzsRcG7EUN9sGgMffC09P9kWM2ESrl5eWYOnUq5HI5PvnkEygUCv2Yn58f\nlEolgoODAQBKpRLPPPOMfuzKlSv6dZVKJTw8PJ44I3pIEAT86UQys+h0ArTaxnWg1BV7YcBe1GAf\nDJpaL6x+nYogCJg1axa8vb2RmZlpFCgAEBsbi8zMTNy6dQslJSXYvHkzhg0bph/75JNPcPHiRdy7\ndw8bN25ETEwMHMydfhARkSisPlP5/vvvceLECbi4uCA0NFS/vEOHDsjOzsaYMWNQUlKCkSNH4sGD\nB4iJicHkyZMBABEREfj1118RFxeHO3fuICwsDPPnz7fWrhARNXmSpvw8lbqcUuzoKIGnpzvKyu41\nqSntX2EvDNiLGuyDQWPuxeNOKebnREREJBqGChERiYahQkREorH6F/VERCSOKW8dMnndDxZEWKQG\nzlSIiEg0DBUiIhINQ4WIiETDUCEiItEwVIiISDQMFSIiEg1DhYiIRMNQISIi0TBUiIhINAwVIiIS\nDUOFiIhEw1AhIiLRMFSIiEg0DBUiIhINQ4WIiETDUCEiItEwVIiISDQMFSIiEg1DhYiIRMNQISIi\n0TBUiIhINAwVIiISjVVCpbCwEP369dO/VqlUSEhIQEhICMLDw5GTk6Mf02g0WLhwIUJDQ9GnTx+k\np6frxwRBQEpKCnr16oUePXpg+fLl0Gq1DbovRERk0KChIggCdu3ahSlTpuDBgwf65YsXL4ZcLkdB\nQQE2btyIdevW4cKFCwCA9evXo7i4GP
n5+di+fTtycnJw6NAhAEB2djaOHDmCvXv3Ijc3F6dPn8b2\n7dsbcpeIiOhPGjRU3nvvPWRlZSE+Pl6/7P79+8jLy0NiYiJcXFzQuXNnREdH62cre/fuRVxcHDw8\nPNCuXTuMGzcOO3fuBADs2bMHEydORMuWLdGiRQvExcXpx4iIqOFJG/LNRowYgfj4eJw4cUK/7Nq1\na5BKpWjdurV+ma+vLw4cOACVSoWSkhL4+/sbjWVnZwMArly5Umvs0qVLEAQBEonkifVIJBI4mBmr\nDg4So59NGXthwF7UYB8MbL0Xjo6WqatBQ6Vly5a1llVUVEAmkxktk8lkUKvVqKysBAC4urrWGgOA\nyspKo21dXV2h0+mg0Wjg4uLyxHq8vNxMCp+/olC41Wm7xoi9MGAvarAPBrbaC09Pd4v8uQ0aKn/F\n1dVVHxIPqdVqyOVyfWCo1Wq4u7sbjQE1AVNVVaXfrrKyElKp1KRAAYDS0vt1mqkoFG4oL78PnU4w\nb+NGhr0wYC9qsA8Gtt6LsrJ7dd72cYFk9VBp27YtqqurUVxcDB8fHwCAUqmEv78/FAoFvLy8oFQq\n4e3trR/z8/MDAPj5+UGpVCI4OFg/9swzz5j83oIgoK4ni+l0ArRa2ztQrIG9MGAvarAPBrbaC0vV\nZPXrVNzd3REZGYmUlBRUVlaisLAQ+/fvR0xMDAAgNjYWaWlpKC8vx9WrV7Ft2zYMGzZMP5aZmYlb\nt26hpKQEmzdv1o8REVHDs/pMBQCSk5OxdOlShIWFQS6XIykpST/7mDNnDlauXImoqChIJBJMmDAB\nUVFRAIAxY8agpKQEI0eOxIMHDxATE4PJkydbc1eIiJo0iSAItjcvayC3b981extHRwk8Pd1RVnbP\nJqe0DYm9MGAvarAPBtboxZS3Dpm87gcLIur8Pi1aeDxyzOoffxERUePBUCEiItEwVIiISDQMFSIi\nEg1DhYiIRMNQISIi0TBUiIhINAwVIiISDUOFiIhEw1AhIiLRMFSIiEg0DBUiIhINQ4WIiETDUCEi\nItEwVIiISDQMFSIiEg1DhYiIRMNQISIi0TBUiIhINAwVIiISDUOFiIhEw1AhIiLRMFSIiEg0DBUi\nIhINQ4WIiETDUCEiItHYTKicPn0aw4cPR7du3TB48GDs27cPAKBSqZCQkICQkBCEh4cjJydHv41G\no8HChQsRGhqKPn36ID093VrlExERAKm1CwAArVaLhIQELF26FEOGDMHJkycxceJEdO3aFWvWrIFc\nLkdBQQGKioowbdo0BAUFITAwEOvXr0dxcTHy8/NRWlqKKVOmICAgABEREdbeJSKiJskmZip37txB\nWVkZtFotBEGARCKBk5MTHB0dkZeXh8TERLi4uKBz586Ijo7Wz1b27t2LuLg4eHh4oF27dhg3bhx2\n7txp5b0hImq6bGKm0rx5c4wZMwZz585FUlISdDodVqxYgT/++ANSqRStW7fWr+vr64sDBw5ApVKh\npKQE/v7+RmPZ2dkmv69EIoGDmbHq4CAx+tmUsRcG7EUN9sHA1nvh6GiZumwiVHQ6HWQyGd5++21E\nRESgoKAA8+bNQ3p6OmQymdG6MpkMarUalZWVAABXV9daY6by8nKDRFK3xioUbnXarjFiLwzYixrs\ng4Gt9sLT090if65NhMqBAwdQWFiIV199FQAQHh6O8PBwpKWl1QoJtVoNuVyuDxu1Wg13d3ejMVOV\nlt6v00xFoXBDefl96HSCeRs3MuyFAXtRg30wsPVelJXdq/O2jwskmwiVmzdvQqPRGC2TSqXo2LEj\nTp06heLiYvj4+AAAlEol/P39oVAo4OXlBaVSCW9vb/2Yn5+fye8rCAK02rrVrNMJ0Gpt70CxBvbC\ngL2owT4Y2GovLFWTTXxR36dPH5w/fx6ffvopBEHAiRMncPDgQQwdOhSRkZFISUlBZWUlCgsLsX//\nfs
TExAAAYmNjkZaWhvLycly9ehXbtm3DsGHDrLw3RERNl8mh8ttvv1msiICAAGzcuBFZWVkICQnB\nm2++idWrVyMoKAjJycmorq5GWFgYEhMTkZSUhODgYADAnDlz0K5dO0RFRWHMmDEYNWoUoqKiLFYn\nERE9nkQQBJPmQB06dEBoaChiY2MxaNAg/fcY9uz27btmb+PoKIGnpzvKyu7Z5JS2IbEXBuxFDfbB\nwBq9mPLWIZPX/WBB3a/na9HC45FjJs9U9u3bh5CQEGRkZKBv375ITExEXl4eHjx4UOfCiIiocTE5\nVPz8/DBr1ix8+eWX2L59O9q0aYP169ejb9++WLJkCU6ePGnJOomIyA7U6Yt6X19fBAYGon379qiq\nqsLp06eRmJiI6OhonD17VuwaiYjITph8SnFVVRUOHz6M3NxcHDt2DM2aNcPQoUPxySefIDAwENXV\n1XjjjTcwe/ZsHDpk+ud6RETUeJgcKr169YKjoyMGDhyI9PR09OrVy+hqdKlUiueeew6nT5+2SKFE\nRGT7TA6VFStWIDIyEi4uLo9cZ9CgQRg0aJAohRERkf0x+TuV8PBwrFixApmZmfplQ4YMQXJyMqqq\nqixSHBER2ReTQyU5ORnff/89unfvrl+2cOFCfPfdd1izZo1FiiMiIvticqgcOnQIa9as0V/NDgD9\n+/fH8uXL8cUXX1ikOCIisi9mnVJcXV1da5mjoyM//iIiIgBmfqfyxhtv4OLFi/plly9fxooVKxAW\nFmaR4oiIyL6YfPbXokWLkJCQgJiYGP0ZYBqNBn379sXrr79usQKJiMh+mBwqf/vb37B161ZcunQJ\nly5dgpOTE9q1a2fW80uIiKhxM+shXVqtFs7Ozmjfvj0EQYAgCLh06RIAGD0rnoiImiaTQ+Xo0aNY\ntGgRSktLjZYLggCJRILz58+LXhwREdkXk0MlNTUV3bp1Q0JCQqN4lgoREYnP5FC5du0aUlNT+R0K\nERE9ksmnFAcFBRmdTkxERPS/TJ6pDBkyBEuXLsWJEyfQpk0bODk5GY2PHTtW9OKIiMi+mBwqmZmZ\ncHNzw5EjR2qNSSQShgoREZkeKnzwFhERPYlZ9/7SaDTYt28f0tLSUF5ejuPHj6OkpMRStRERkZ0x\neaZy/fp1TJw4EVqtFiUlJXj++eeRnZ2N48eP48MPP0SHDh0sWScREdkBk2cqK1asQL9+/XD48GE4\nOzsDqLl2JTw8HKtWrbJYgUREZD9MDpVTp05h0qRJcHAwbCKVSjF9+nT8+OOPFimOiIjsi8mh4uzs\nDJVKVWv59evX4ebmJmpRRERkn0wOldjYWCQnJ+OHH34AAJSVlSE/Px9Lly5FdHR0vQu5desW4uLi\n0K1bN/Tv3x9ZWVkAAJVKhYSEBISEhCA8PBw5OTn6bTQaDRYuXIjQ0FD06dMH6enp9a6DiIjqzuQv\n6ufNm4fU1FSMHz8eGo0Go0ePhlQqxYsvvoi5c+fWqwhBEDBjxgz07NkTmzZtwtWrVzF27Fh06tQJ\nH330EeRyOQoKClBUVIRp06YhKCgIgYGBWL9+PYqLi5Gfn4/S0lJMmTIFAQEBiIiIqFc9RERUNyaH\nilQqxfz58zF79mz88ssv0Gq1aNOmDeRyeb2L+OGHH/D777/jlVdegaOjI9q3b48dO3bAxcUFeXl5\n+Oqrr+Di4oLOnTsjOjoaOTk5WLx4Mfbu3Yt169bBw8MDHh4eGDduHHbu3MlQISKyEpND5eFzU4Ca\nK+ilUimKi4v1y+rzPJVz586hffv2WLt2Lfbt2wd3d3fEx8cjICAAUqkUrVu31q/r6+uLAwcOQKVS\noaSkxOh9fX19kZ2dbfL7SiQSOJh1pQ7g4CAx+tmUsRcG7EUN9sHA1nvh6GiZukwOlejoaEgkEgiC\nAKDmL+SHPx0cHOp1BphKpcLx48fRq1cvHD58GD/++COmTp2KjIwM
yGQyo3VlMhnUajUqKysBAK6u\nrrXGTOXl5abfD3MpFDw54SH2woC9qME+GNhqLzw9LfMIE5NDJT8/3+i1VqvFL7/8grfffhuzZs2q\nVxHOzs5o1qwZ4uLiAADdunXD4MGDsXHjxlohoVarIZfL9WGjVqv1z3d5OGaq0tL7dZqpKBRuKC+/\nD51OMG/jRoa9MGAvarAPBrbei7Kye3Xe9nGBZHKotGrVqtayNm3awMPDAwsWLED//v3rVh1qPraq\nrKxEdXU1pNKakrRaLTp06ICTJ0+iuLgYPj4+AAClUgl/f38oFAp4eXlBqVTC29tbP2bO814EQYBW\nW7eadToBWq3tHSjWwF4YsBc12AcDW+2FpWoy89/ptbm4uBh9t1IXffv2xd/+9jekpKSguroap0+f\nxsGDBzFkyBBERkYiJSUFlZWVKCwsxP79+xETEwOg5jTnh/chu3r1KrZt24Zhw4bVd5eIiKiOTJ6p\n/NUX4Pfv38eePXsQEhJSryJkMhm2bt2KN998E3369IG7uztef/11dOnSBcnJyVi6dCnCwsIgl8uR\nlJSE4OBgAMCcOXOwcuVKREVFQSKRYMKECYiKiqpXLUREVHcS4eE370/wv6fpSiQSODk5ISgoCC+/\n/LL+4yl7cvv2XbO3cXSUwNPTHWVl92xyStuQ2AsD9qIG+2BgjV5Mecv0R5R8sKDul160aOHxyDE+\nT4WIiERTp+tUnqQ+16wQEZH9Mvs6FQC1rlV5SBAESCQSnD9/XsQSiYjIXpgcKmlpaUhNTUVSUhJC\nQkLg5OSEc+fOITk5GcOHD8fAgQMtWScREdkBk0Nl1apVWLNmDbp3765f1qNHDyxfvhwzZ87EpEmT\nLFEfERHZEZOvU1GpVPonPv5ZVVWV/pYpRETUtJkcKoMGDcKCBQtw7NgxlJSU4Pbt28jLy8PChQvx\n/PPPW7JGIiKyEyZ//LV48WIsWrQI06dPh06nAwA4OTlh/PjxmDNnjsUKJCIi+2FyqMjlcqxfvx53\n7tzB1atX4erqijZt2sDFxcWS9RERkR0x695fpaWl2L59O7Zv3w5PT0/k5+fj559/tlRtRERkZ0wO\nlZ9++glDhgzBkSNHsH//flRUVKCgoAAvvPACvvnmG0vWSEREdsLkUFm1ahUmTJiAHTt2wMnJCQCw\nfPlyjB8/HuvWrbNYgUREZD9MDpVz584hNja21vLRo0fj8uXLohZFRET2yeRQadasGW7cuFFr+Y8/\n/ghPT09RiyIiIvtkcqi8+OKLWLx4MXJzcwEA58+fR1ZWFpYtW4bRo0dbrEAiIrIfJp9S/NJLL8HN\nzQ1r165FZWUlEhMT4e3tjenTp2PixImWrJGIiOyEyaGSm5uLmJgYjB07FhUVFdDpdHB3d7dkbURE\nZGdM/vhr2bJluH37NoCaCyEZKERE9L9MDpVOnTrh2LFjlqyFiIjsnMkffzk7O2P16tXYtGkTfHx8\nat2eZdeuXaIXR0RE9sXkUOnUqRM6depkyVqIiMjOPTZUgoODcfjwYXh6emLmzJkAaq5LCQgI0F9V\nT0RE9NBjv1OpqqrSP4/+oQkTJuDWrVsWLYqIiOyTWXcpBlArZIiIiB4yO1SIiIgehaFCRESieeLZ\nX7t374ZcLte/1mq12LNnD5o3b2603tixY0UpqKSkBDExMVi5ciUGDBiAX3/9FYsWLUJhYSFatmyJ\nBQsWYMCAAQAAlUqFhQsX4ttvv4WHhwcSEhLwwgsviFIHERGZ77Gh4uPjg48//thombe3N3bv3m20\nTCKRiBYqixYtQnl5uf717Nmz0adPH7z//vsoKCjAyy+/jLy8PHh6emLx4sWQy+UoKChAUVERpk2b\nhqCgIAQGBopSCxERmeexoXLo0KGGqgMA8PHHH8PV1RVPP/00AODy5cv4+eefkZ2dDScnJ4SFhSE0\nNBSfffYZRo8ejby8PHz11Vdw
cXFB586dER0djZycHCxevLhB6yYiohomX/xoaVevXsWHH36InTt3\nYvjw4QCAK1euoFWrVpDJZPr1fH19cfHiRVy7dg1SqRStW7c2Gjtw4IDJ7ymRSOBg5rdKDg4So59N\nGXthwF7UYB8MbL0Xjo6WqcsmQqW6uhpJSUlYtGgRFAqFfnlFRQVcXV2N1pXJZFCr1aioqDAKmz+P\nmcrLyw0SSd0aq1C41Wm7xoi9MGAvarAPBrbaC09Py9wU2CZC5d1338Wzzz6LsLAwo+Wurq61QkKt\nVkMulz92zFSlpffrNFNRKNxQXn4fOl3TumZn4op8k9b7f4siLVyJ7WnKx8WfsQ8Gtt6LsrJ7dd72\ncYFkE6GSm5uL27dv658qee/ePcydOxfx8fG4ceMGNBoNnJ2dAQBKpRI9e/ZE27ZtUV1djeLiYvj4\n+OjH/P39TX5fQRCg1datZp1OgFZreweKLWjKfeFxUYN9MLDVXliqJpsIlS+//NLodUREBBYvXowB\nAwbgq6++woYNGzBnzhx88803OH78OJYuXQp3d3dERkYiJSUFy5cvx8WLF7F//35kZGRYaS+InmzK\nW6ad/PLBgggLV0JkGTZ/8WNaWhqKiorQu3dvrFy5Eqmpqfqzw5KTk1FdXY2wsDAkJiYiKSkJwcHB\nVq6YiKjpsomZyv/686nMrVq1QmZm5l+up1Ao8PbbbzdUWURE9AQ2GSpk35riRzymnsRA1NjZ/Mdf\nRERkPzhToSapKc6miBoCZypERCQazlQaMf5rnIgaGmcqREQkGs5UiB7D1NkeEdVgqJBd4Ed5RPaB\noUL817gNMue/CYOUbAm/UyEiItEwVIiISDT8+IusxhIfu/GjPCLr4kyFiIhEw1AhIiLRMFSIiEg0\nDBUiIhINQ4WIiETDUCEiItEwVIiISDQMFSIiEg1DhYiIRMNQISIi0TBUiIhINAwVIiISDUOFiIhE\nw1AhIiLR2EyonDx5Ei+88AJCQkLwz3/+Ezt27AAAqFQqJCQkICQkBOHh4cjJydFvo9FosHDhQoSG\nhqJPnz5IT0+3VvlERAQbeZ6KSqXCjBkz8PrrryM6Ohrnz5/H5MmT0aZNG+zYsQNyuRwFBQUoKirC\ntGnTEBQUhMDAQKxfvx7FxcXIz89HaWkppkyZgoCAAERE8PGqRETWYBMzleLiYoSFhSE2NhYODg7o\n2LEjevbsidOnTyMvLw+JiYlwcXFB586dER0drZ+t7N27F3FxcfDw8EC7du0wbtw47Ny508p7Q0TU\ndNnETOXZZ5/F2rVr9a9VKhVOnjyJgIAASKVStG7dWj/m6+uLAwcOQKVSoaSkBP7+/kZj2dnZJr+v\nRCKBg5mx6uAgMfpJZG2OjrZzLPL3w8DWe2Gp48YmQuXP7t69i/j4eP1sJSsry2hcJpNBrVajsrIS\nAODq6lprzFReXm6QSOrWWIXCrU7bEYnN09Pd2iXUwt8PA1vthaWOG5sKlevXryM+Ph6tW7fGhg0b\ncPny5VohoVarIZfLIZPJ9K/d3d2NxkxVWnq/TjMVhcIN5eX3odMJ5m1MZAFlZfesXYIefz8MbL0X\n9TluHhdINhMq586dw9SpUxEbG4tXX30VDg4OaNu2Laqrq1FcXAwfHx8AgFKphL+/PxQKBby8vKBU\nKuHt7a0f8/PzM/k9BUGAVlu3enU6AVqt7R0o1PTY4nHI3w8DW+2FpWqyiS/qS0pKMHXqVEyePBmv\nvfYaHP5v+uDu7o7IyEikpKSgsrIShYWF2L9/P2JiYgAAsbGxSEtLQ3l5Oa5evYpt27Zh2LBh1twV\nIqImzSZmKrt27UJZWRnS09ONrjWZMGECkpOTsXTpUoSFhUEulyMpKQnBwcEAgDlz5mDlypWIioqC\nRCLBhAkTEBUVZa3dICJq8iSCINjevKyB3L591+xtHB0l8PR0R1nZPZuc0v7ZlLcOWbsEsiEfLL
D8\n9Vv29PthadbohTm/8/U5Hlq08HjkmE18/EVERI0DQ4WIiETDUCEiItEwVIiISDQMFSIiEg1DhYiI\nRMNQISIi0TBUiIhINAwVIiISDUOFiIhEw1AhIiLRMFSIiEg0DBUiIhINQ4WIiETDUCEiItEwVIiI\nSDQMFSIiEg1DhYiIRGMTz6gn8/AxwURkqzhTISIi0TBUiIhINAwVIiISDUOFiIhEw1AhIiLRMFSI\niEg0DBUiIhKN3YfKTz/9hJEjR6JLly4YNmwYzpw5Y+2SiIiaLLsOlaqqKsTHx2P48OH47rvvMH78\neMycORMajcbapRERNUl2HSrffvstHBwcMGbMGDg5OWHkyJFo3rw5Dh8+bO3SiIiaJLu+TYtSqYSf\nn5/RMl9fX1y8eBGDBw9+4vYSiQQOZsaqg4PE6CeRvXB0tPwxy98PA1vvhaWOB7sOlYqKCri6uhot\nk8lkUKvVJm3v7e1e5/dWKNzqvG197UsZZrX3JjKFNX8/bE1D9sIW/m6w64+/XF1dawWIWq2GXC63\nUkVERE2bXYfKM888A6VSabRMqVTC39/fShURETVtdh0qvXv3hkajwdatW/HgwQPs2rULJSUl6Nev\nn7VLIyJqkiSCIAjWLqI+Lly4gGXLlqGoqAht27bFsmXL0KVLF2uXRUTUJNl9qBARke2w64+/iIjI\ntjBUiIhINAwVIiISDUMFQGFh4SPPGFuyZAm6du2q/1+XLl0QEBCAffv2AXj8DS1//fVXTJw4EV27\ndsXgwYNt/vYxlurDxYsXMWHCBHTv3h1hYWHYtGkTbP2rPEv14iG1Wo0hQ4Zg27ZtFt2P+rJUH+7e\nvYt58+ahR48e6N27N1JTUxtkf+rDUr24fPkyxo0bh+7du2PAgAH46KOPGmJ3LEdownQ6nZCTkyOE\nhIQIoaGhJm2zYcMGYdy4cYJGoxHUarXw3HPPCdnZ2YJGoxFycnKEvn37ClVVVYIgCMLw4cOFdevW\nCRqNRjhy5IjQtWtXobS01JK7VCeW7INWqxUiIyOFzZs3Cw8ePBCuXbsmREZGCjt37rTwXtWNpY+J\nh5YtWyYEBgYKW7dutcRu1Jul+zBz5kxh7ty5QkVFhVBcXCxERkYKe/futeQu1Zmle/Hvf/9b+OCD\nDwSdTidcvHhRCAkJEU6cOGHJXbKoJh0q7777rhATEyNs2bLFpIPl7NmzQkhIiFBcXCwIgiAcOXJE\nCAsLM1onOjpa+PLLL4VLly4JnTp1EiorK/VjcXFxQmZmpqj7IAZL9uHWrVvC1KlTBa1Wqx9btWqV\nsGDBAlH3QSyW7MVDR48eFUaPHi28+OKLNhsqlj4mOnXqJNy5c0c/dv36deG3334TdR/EYuljokuX\nLsLmzZuF6upq4eLFi0JoaKjSB/L9AAAFZUlEQVRw5swZ0fejoTTpj79GjBiBPXv2ICgoyKT1V61a\nhZdeeglPP/00gMff0PLKlSto1aoVZDJZrTFbY8k+PPXUU9iyZQsc/u/OnRqNBl9//TUCAwPF3QmR\nWLIXAPDHH39g+fLlWL16NRwdHcUtXkSW7MOFCxfQqlUrfPzxxwgPD0dERARyc3PRsmVL0fdDDJY+\nJqZPn44NGzYgKCgIQ4cOxbhx4xAcHCzuTjSgJh0qLVu2hERi2p06T506hUuXLmHs2LH6ZY+7oWV9\nb3bZkCzZhz/TaDSYN28enJycMHr06PoXbgGW7sWSJUswadIktG3bVryiLcCSfSgvL8cvv/yCmzdv\n4osvvkBGRga2bduGPXv2iLoPYrH0MSGRSLBo0SKcOXMGO3bsQHZ2No4ePSreDjSwJh0q5ti9ezdi\nY2Ph5ma44+jjbmjZWG92aW4fHvrjjz8wefJk/P777/jwww+NZnD2ytxefPrpp6ioqMCYMWMaulSL\nMrcPzs7O0Ol0eOWVV+Dq6gp/f3+MGjUK+fn5DV266Mztxd
mzZ5GdnY2xY8fC2dkZXbt2xahRo7Br\n166GLl00DBUTHT58GFFRUUbLHndDSz8/P9y4ccPoKZSN4WaX5vYBqDkLbtSoUXjqqaeQlZWF5s2b\nN1i9lmRuL3Jzc/H999+je/fu6N69O06dOoW1a9di2bJlDVi1+Mztg6+vLwRBwL179/RjWq3W5s8I\nNIW5vbh582atJ9VKpVJIpfb7VBKGigmuX7+OO3fuoFOnTkbLH3dDSz8/P/j7+2PDhg3QaDQ4evQo\njh8/jiFDhlhpL+qvLn1Qq9WYOnUq+vbti9TUVLi4uFipenHVpReZmZk4ffo0Tp48iZMnTyIkJARJ\nSUl2HSp16UNgYCA6duyI1atXQ61W4/Lly8jJyan1l7G9qUsvunXrBo1Gg3feeQdarRYXLlzAzp07\n8a9//ctKe1F/DJW/sGTJEixZskT/+saNG2jWrBmcnZ2N1nN2dsaWLVvw+eefIzQ0FNu2bUN6err+\nY5+0tDQUFRWhd+/eWLlyJVJTU/Vf3tkDMfpw8OBBKJVKfPbZZ0bn8SclJTX07tSLWMeEvROrDxkZ\nGdBqtQgPD8fEiRMxfvx4u/uLVIxeeHt7IyMjA19//TVCQ0Mxa9YsJCQkYODAgQ29O6LhDSWJiEg0\nnKkQEZFoGCpERCQahgoREYmGoUJERKJhqBARkWjs9wobIiIyS2FhIWbMmIH//ve/T1x36NChKC4u\n1r+urq6GRqPBsWPH8NRTTz1yO4YKEVEjJwgCPv30U7z11lsm38j0888/1/9/nU6HSZMmoWvXro8N\nFIAffxERNXrvvfcesrKyEB8fb7S8vLwcSUlJ6N27NyIiIpCRkfGXt8vJysrCvXv3kJiY+MT34kyF\niKiRGzFiBOLj43HixAmj5fPnz4dCoUB+fj7KysoQHx8PLy8vjBgxQr+OSqXCpk2b8P7775s0y+FM\nhYiokfur2/ffvn0bx44dw2uvvQa5XI5//OMf+M9//oOcnByj9bZv347g4GB06dLFpPfiTIWIqAm6\nefMmBEEwus+YTqeDQqEwWm/37t149dVXTf5zGSpERE1QixYtIJVKUVBQoL8Jpkqlwv379/XrXL58\nGSUlJejfv7/Jfy4//iIiaoKefvpphISEYO3atfonciYmJmL9+vX6dc6cOYOOHTvWuvPy4zBUiIia\nqNTUVJSWliIiIgKDBw9Gy5YtsXTpUv34jRs30KJFC7P+TN76noiIRMOZChERiYahQkREomGoEBGR\naBgqREQkGoYKERGJhqFCRESiYagQEZFoGCpERCQahgoREYnm/wNtA4p9o894IwAAAABJRU5ErkJg\ngg==\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1a1ea32a90>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Histogram of case numbers; unexpected clusters or gaps can signal data issues\n",
    "calls['CASENO'].plot.hist(bins=30)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice the unexpected clusters at 17030000 and 17090000. By plotting the distribution of case numbers, we can quickly see anomalies in the data. In this case, we might guess that two different teams of police use different sets of case numbers for their calls.\n",
    "\n",
    "Exploring the data often reveals anomalies like these; when an anomaly can be fixed, we then apply data cleaning techniques to correct it."
   ]
  }
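  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to follow up on clusters in the case numbers is to group the numbers by their leading digits and count each group. The cell below is a minimal sketch that uses only the five case numbers visible in the `calls.head()` output; the four-digit prefix length is an assumption chosen for illustration, not a documented property of the case-number format. On the full table, the same idea would apply to `calls['CASENO']`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# The five case numbers shown in the calls.head() output.\n",
    "case_numbers = pd.Series([17091420, 17038302, 17049346, 17091319, 17044238])\n",
    "\n",
    "# Integer-divide by 10,000 to keep the leading four digits (assumed prefix length).\n",
    "prefix = case_numbers // 10_000\n",
    "\n",
    "# Count how many case numbers share each prefix, ordered by prefix.\n",
    "prefix_counts = prefix.value_counts().sort_index()\n",
    "prefix_counts"
   ]
  }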
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
